---
title: Add files from remote repos to custom models
description: Add files from remote repositories, including Bitbucket, GitHub, GitHub Enterprise, S3, GitLab, and GitLab Enterprise to the models you create in the Custom Model Workshop.

---

# Add files from remote repos to custom models {: #add-files-from-remote-repos-to-custom-models }

After you [add a model](custom-inf-model#create-a-new-custom-model) to the Custom Model Workshop, you can add files to it from a range of remote repositories, including Bitbucket, GitHub, GitHub Enterprise, S3, GitLab, and GitLab Enterprise. Once a repository is added to DataRobot, you can [pull files](#pull-files-from-the-repository) from it and include them in the custom model.


## Add a remote repository {: #add-a-remote-repository }

The following steps show how to add a remote repository so that you can pull files into a custom model.

1. Select a custom model you wish to add files to, and navigate to **Assemble** > **Add files** > **Remote repository**.

    ![](images/custom-repo-1.png)

2. Click **add new** to integrate a new remote repository with DataRobot.

    ![](images/custom-repo-2.png)

    See the following sections for the steps to register each repository type:

    * [Bitbucket Server](#bitbucket-server-repository)
    * [GitHub](#github-repository)
    * [GitHub Enterprise](#github-enterprise-repository)
    * [S3](#s3-repository)
    * [GitLab](#gitlab-cloud-repository)
    * [GitLab Enterprise](#gitlab-enterprise-repository)

### Bitbucket Server repository {: #bitbucket-server-repository }

To register a Bitbucket Server repository:

1. Select **Bitbucket Server** from the list of repositories to be added in step 2 of the [Add a remote repository](#add-a-remote-repository) procedure.

2. Complete the required fields:

    ![](images/remote-3.png)

    | Field | Description |
    |-----------------------|--------------|
    | Name | The name of the Bitbucket Server repository.|
    | Repository location   | The URL for the Bitbucket Server repository that appears in the browser address bar when accessed. Alternatively, select **Clone** from the Bitbucket Server UI and paste the URL.|
    | Personal access token | The token used to grant DataRobot access to the Bitbucket Server repository. To generate it in the Bitbucket Server UI, navigate to **Profile > Manage account > Personal access tokens** and select **Create a token**. Name the token, review the permissions, and, once the token is created, copy the token string into this field. |
    | Description |  Optional. A description of the Bitbucket Server repository.|

3. Click **Test** to verify the connection to the repository.

4. Once you have verified the connection, click **Add repository**. The Bitbucket Server repository can now be used to [pull files](#pull-files-from-the-repository) for custom models.
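
Before pasting a token into DataRobot, you may want to confirm outside DataRobot that it can read the repository. The following is a minimal sketch that only builds the request. It assumes the Bitbucket Server REST endpoint `/rest/api/1.0/projects/{key}/repos/{slug}` and bearer-token authentication; the function name, host, project key, repository slug, and token are placeholders for illustration.

```python
import urllib.request
from urllib.parse import quote


def bitbucket_repo_request(base_url: str, project_key: str, repo_slug: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a Bitbucket Server repository."""
    url = (
        f"{base_url.rstrip('/')}/rest/api/1.0/projects/"
        f"{quote(project_key, safe='')}/repos/{quote(repo_slug, safe='')}"
    )
    # Assumption: the personal access token is sent as a bearer token.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical values for illustration only.
bitbucket_request = bitbucket_repo_request(
    "https://bitbucket.example.com/", "PROJ", "models", "<token>"
)
print(bitbucket_request.full_url)
```

Passing the request to `urllib.request.urlopen` sends it; a successful response suggests that the token can read the repository.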

### GitHub repository {: #github-repository }

To register a public GitHub repository:

1. Select **GitHub** from the list of repositories to be added in step 2 of the [Add a remote repository](#add-a-remote-repository) procedure.

2. Authorize the GitHub app by clicking **Authorize GitHub App** and agreeing to grant DataRobot read-only access to your GitHub account's public repositories.

    ![](images/custom-repo-3.png)

    !!! note
        You can also use repositories that are part of any [GitHub organization](#github-organization-repository-access) you belong to.

    !!! tip
        At any time you can **Unauthorize** the app. This revokes access from all of your registered GitHub repositories in DataRobot. All registered repositories will be preserved, but without access to your GitHub repositories. You can re-authorize the app later.

3. Once authorized, complete the required fields:

    ![](images/custom-repo-4.png)

    | Field | Description |
    |-----------------------|--------------|
    | Name | The name of the GitHub repository.|
    | Edit repository permissions | To use a private repository, you need to [grant the GitHub app access](#edit-github-repository-permissions). |
    | Repository | Enter the GitHub repository URL. Start typing the repository name and repositories will populate in the autocomplete dropdown. Notes: <ul><li> When you [grant access to a private repository](#edit-github-repository-permissions), its URL is added to the **Repository** autocomplete dropdown. </li><li>To use an external public GitHub repository, you must [obtain the URL from the repo](#external-github-repositories).</li></ul> |
    | Description | Optional. A description of the GitHub repository. |

4. Click **Test** to verify the repository connection.

5. When validated, select **Add repository**. You can now [pull files](#pull-files-from-the-repository) from the repository to add to a custom model.

#### Edit GitHub repository permissions {: #edit-github-repository-permissions }

To use a private repository, click **Edit repository permissions** in the **Add GitHub repository** window. This gives the GitHub app access to your private repositories. You can give access to:

* All current and future private repositories
* A selected list of repositories

![](images/custom-repo-9.png)

After access is granted, the private repositories appear in the autocomplete dropdown for the **Repository** field.

#### External GitHub repositories {: #external-github-repositories }

To use an external public GitHub repository that is not owned by you or your organization, navigate to the repository in GitHub and click **Code**. Copy the URL and paste it into the **Repository** field of the **Add GitHub repository** window.

![](images/custom-repo-8.png)

#### GitHub organization repository access {: #github-organization-repository-access }

If you belong to a GitHub organization, you can request access to one of the organization's repositories for use with DataRobot. The request notifies the GitHub admin, who then approves or denies it.

!!! note
    If your admin approves a single user's access request, access is provided to **all** DataRobot users in that user's organization without any additional configuration. For more information, reference the [GitHub documentation](https://docs.github.com/en/github/setting-up-and-managing-organizations-and-teams/managing-access-to-your-organizations-repositories){ target=_blank }.

### GitHub Enterprise repository {: #github-enterprise-repository }

To register a GitHub Enterprise repository:

1. Select **GitHub Enterprise** from the list of repositories to be added in step 2 of the [Add a remote repository](#add-a-remote-repository) procedure.

2. Complete the required fields:

    ![](images/remote-5.png)

    | Field | Description |
    |-----|-------|
    | Name | The name of the GitHub Enterprise repository. |
    | Repository location   | The URL for the GitHub Enterprise repository that appears in the browser address bar when accessed. Alternatively, select **Clone** from the GitHub UI and paste the URL.|
    | Personal access token | The token used to grant DataRobot access to the GitHub Enterprise repository. To generate it in the GitHub UI, select your user icon in the top right, navigate to **Settings > Developer Settings**, and select **Personal access tokens**. Click **Generate new token**, name the token, and select "repo" for the scope of access. Once the token is created, copy the token string into this field. |
    | Description |  Optional. A description of the GitHub Enterprise repository.|

3. Click **Test** to verify the connection to the repository.

4. Once you have verified the connection, click **Add repository**. The GitHub Enterprise repository can now be used to [pull files](#pull-files-from-the-repository) for custom models.

#### Git Large File Storage {: #git-large-file-storage }

Git Large File Storage (LFS) is supported by default for GitHub integrations. To learn more, reference the [Git LFS documentation](https://git-lfs.github.com){ target=_blank }. Git LFS support for GitHub requires the GitHub application to be installed on the target repository, even if the repository is public. Unauthorized requests to the LFS API fail with an HTTP 403 error.
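
A file tracked by Git LFS is stored in the Git repository as a small text pointer, while the real content lives in LFS storage. To check which files in a local clone are tracked this way, a rough sketch such as the following can help. It assumes the pointer layout from the Git LFS specification (a `version` line followed by `oid` and `size` lines); the helper name and sample object ID are made up for illustration.

```python
LFS_SPEC_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(content: bytes):
    """Return {'oid': ..., 'size': ...} if content looks like a Git LFS pointer, else None."""
    if not content.startswith(LFS_SPEC_PREFIX):
        return None
    fields = {}
    for line in content.decode("utf-8", errors="replace").splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    if "oid" not in fields or not fields.get("size", "").isdigit():
        return None
    return {"oid": fields["oid"], "size": int(fields["size"])}


# Sample pointer text with a made-up object ID.
sample_pointer = (
    b"version https://git-lfs.github.com/spec/v1\n"
    b"oid sha256:" + b"ab" * 32 + b"\n"
    b"size 12345\n"
)
print(parse_lfs_pointer(sample_pointer))
print(parse_lfs_pointer(b"plain model weights"))
```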

### S3 repository {: #s3-repository }

To register an S3 repository:

1. Select **S3** from the list of repositories to be added in step 2 of the [Add a remote repository](#add-a-remote-repository) procedure.

2. Complete the required fields. Note that AWS credentials are optional for public buckets.

    ![](images/custom-repo-5.png)

    |  Field  |  Description  |
    |-------|--------------|
    | Name   | The name of the S3 repository. |
    | Bucket name | The name of the S3 bucket. If you are adding a public S3 repository, this is the **only** field you must complete.   |
    | Access key ID | The ID that identifies the AWS access key used for programmatic requests made to AWS. Use with the AWS Secret Access Key to authenticate requests to pull from the S3 repository. Required for private S3 repositories. |
    | Secret access key | The key used to sign programmatic requests made to AWS. Use with the AWS Access Key ID to authenticate requests to pull from the S3 repository. Required for private S3 repositories.      |
    | Session token  | Optional. A <a target="_blank" href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html">token</a> that validates temporary security credentials when making a call to an S3 bucket.       |
    | Description |  Optional. A description of the S3 repository.  |

3. Click **Test** to verify the connection to the repository.

4. Once you have verified the connection, click **Add repository**. The S3 repository can now be used to [pull files](#pull-files-from-the-repository) for custom models.

#### AWS S3 access configuration {: #aws-s3-access-configuration }

DataRobot requires the AWS S3 `ListBucket` and `GetObject` permissions to ingest data. Apply these permissions as an additional AWS IAM policy for the AWS user or role that the cluster uses for access. For example, to allow ingestion of data from a private bucket named `examplebucket`, apply the following policy:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::examplebucket"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::examplebucket/*"]
    }
  ]
}
```
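
If you manage several buckets, it can be convenient to generate this policy rather than edit it by hand. The following sketch reproduces the same two statements for any bucket name; the function name is hypothetical, and applying the resulting policy in AWS is left to your usual IAM tooling.

```python
import json


def s3_read_policy(bucket_name: str) -> dict:
    """Return an IAM policy granting ListBucket and GetObject on one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # ListBucket applies to the bucket itself.
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": [bucket_arn]},
            # GetObject applies to the objects inside the bucket.
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": [f"{bucket_arn}/*"]},
        ],
    }


example_policy = s3_read_policy("examplebucket")
print(json.dumps(example_policy, indent=2))
```

Note that the bucket ARN and the object ARN (`/*`) are separate resources, which is why the policy needs two statements.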

#### Remove S3 credentials {: #remove-s3-credentials }

You can remove any S3 credentials by editing the repository connection. Select the connection and click **Clear Credentials**.


### GitLab (cloud) repository {: #gitlab-cloud-repository }

To register a GitLab cloud repository:

1. Select **GitLab** from the list of repositories to be added in step 2 of the [Add a remote repository](#add-a-remote-repository) procedure.

2. Authorize the DataRobot GitLab app by clicking **Authorize GitLab app**.

    ![](images/gitlab-1.png)

    !!! tip
        At any time you can **Unauthorize** the app. This revokes access from all of your registered GitLab repositories in DataRobot. All registered repositories will be preserved, but without access to your GitLab repositories. You can re-authorize the app later.

3. Once authorized, complete the required fields:

    ![](images/gitlab-2.png)

    | Field | Description |
    |-----------------------|--------------|
    | Name | The name of the GitLab repository.|
    | Edit repository permissions | To use a private repository, you need to [grant the GitLab app access](#edit-github-repository-permissions). |
    | Repository | Enter the GitLab repository URL. Start typing the repository name and repositories will populate in the autocomplete dropdown. |
    | Description | Optional. A description of the GitLab repository. |

4. Click **Test** to verify the repository connection.

5. When validated, select **Add repository**. You can now [pull files](#pull-files-from-the-repository) from the repository to add to a custom model.


### GitLab Enterprise repository {: #gitlab-enterprise-repository }

To register a GitLab Enterprise repository:

1. Select **GitLab Enterprise** from the list of repositories to be added in step 2 of the [Add a remote repository](#add-a-remote-repository) procedure.

2. Authorize the DataRobot GitLab app by clicking **Authorize GitLab app**.

    ![](images/gitlab-1.png)

    !!! tip
        At any time you can **Unauthorize** the app. This revokes access from all of your registered GitLab repositories in DataRobot. All registered repositories will be preserved, but without access to your GitLab repositories. You can re-authorize the app later.

3. Once authorized, complete the required fields:

    ![](images/gitlab-5.png)

    | Field | Description |
    |-----------------------|--------------|
    | Name | The name of the GitLab repository.|
    | Edit repository permissions | To use a private repository, you need to [grant the GitLab app access](#edit-github-repository-permissions). |
    | Repository location | Enter the GitLab repository URL. Start typing the repository name and repositories will populate in the autocomplete dropdown. |
    | Personal access token | Enter the token used to grant DataRobot access to the GitLab Enterprise repository. [Generate this token](#create-a-personal-access-token-for-GitLab-enterprise) from GitLab. |
    | Description | Optional. A description of the GitLab repository. |

4. Click **Test** to verify the repository connection.

5. When validated, select **Add repository**. You can now [pull files](#pull-files-from-the-repository) from the repository to add to a custom model.

#### Create a personal access token for GitLab Enterprise {: #create-a-personal-access-token-for-GitLab-enterprise }

To create a personal access token:

1. [Navigate to GitLab](https://gitlab.com/-/profile/personal_access_tokens){ target=_blank }.

2. Enter a name for the new token, set the mandatory scopes (`read_api` and `read_repository`), and click **Create personal access token**.

    ![](images/gitlab-3.png)

    The newly generated token appears at the top of the page.

    ![](images/gitlab-4.png)

3. Enter the new token into the **Personal access token** field in the **Add GitLab Enterprise repository** window.
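
To confirm outside DataRobot that a new token can read a project, you can query the GitLab REST API. The following sketch only builds the request. It assumes the `/api/v4/projects/:id` endpoint with the token passed in the `PRIVATE-TOKEN` header; the function name, host, project path, and token are placeholders for illustration.

```python
import urllib.request
from urllib.parse import quote


def gitlab_project_request(base_url: str, project_path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a GitLab project."""
    # The API accepts the URL-encoded "namespace/project" path in place of a numeric ID.
    encoded_path = quote(project_path, safe="")
    url = f"{base_url.rstrip('/')}/api/v4/projects/{encoded_path}"
    return urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})


# Hypothetical values for illustration only.
gitlab_request = gitlab_project_request(
    "https://gitlab.example.com", "my-group/my-model", "<token>"
)
print(gitlab_request.full_url)
```

Passing the request to `urllib.request.urlopen` sends it; a successful response suggests that the token has the access it needs to read the project.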


## Pull files from the repository {: #pull-files-from-the-repository }

Once you have added a repository to DataRobot, you can pull files from it to build custom models. The following steps use a GitHub repository as an example:

1. Navigate to **Assemble** > **Add files** > **Remote repository**.

    ![](images/custom-repo-1.png)

2. Click **Select a remote repository** and choose a repository from the list.

    ![](images/custom-repo-6.png)

    For a GitHub repository:

    ![](images/custom-repo-7.png)

3. Enter the tag, branch, or commit hash from which you want to pull files.

4. Specify the path to the files being pulled.

5. After specifying the path, click **Pull into model**. The files appear under the **Model** header as part of the custom model.
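
Step 3 accepts a tag, branch, or commit hash. To see which branches and tags a repository exposes before entering one, `git ls-remote` lists each ref with the commit it points to. The following sketch parses that output into a lookup table; the helper name is hypothetical, and the sample output and hashes are made up for illustration.

```python
def parse_ls_remote(output: str) -> dict:
    """Map short branch and tag names to commit hashes from `git ls-remote` output."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line has the form "<commit hash><TAB><ref name>".
        commit, _, ref = line.partition("\t")
        for prefix in ("refs/heads/", "refs/tags/"):
            if ref.startswith(prefix):
                name = ref[len(prefix):]
                # Peeled tag entries end in "^{}" and point at the tagged commit.
                if name.endswith("^{}"):
                    name = name[:-3]
                refs[name] = commit
    return refs


# Made-up output for illustration.
sample_output = (
    "1111111111111111111111111111111111111111\trefs/heads/main\n"
    "2222222222222222222222222222222222222222\trefs/tags/v1.0\n"
    "3333333333333333333333333333333333333333\trefs/tags/v1.0^{}\n"
)
available_refs = parse_ls_remote(sample_output)
print(available_refs)
```

To produce real input for the helper, run `git ls-remote --heads --tags` against the repository URL. If a branch and a tag share a name, this simple mapping keeps only the last one listed.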
